Topic models jointly learn topics and document-level topic distribution. Extrinsic evaluation of topic models tends to focus exclusively on topic-level evaluation, e.g. by assessing the coherence of topics. We demonstrate that there can be large discrepancies between topic- and document-level model quality, and that basing model evaluation on topic-level analysis can be highly misleading. We propose a method for automatically predicting topic model quality based on analysis of document-level topic allocations, and provide empirical evidence for its robustness.